Queens, NY
• Identified areas for improvement in the existing business by analyzing large volumes of data with machine learning techniques.
• Worked with business units (partners) to identify, prioritize, clearly define, and document analytical needs while maintaining perspective on the company's ultimate KPIs.
• Utilized analytical applications like R, SPSS, Rattle and Python to identify trends and relationships between different pieces of data, draw appropriate conclusions and translate analytical findings into risk management and marketing strategies that drive value.
• Designed and implemented statistical / predictive models utilizing diverse sources of data to predict demand, risk and price elasticity
• Interpreted business problems and provided solutions using data analysis, data mining, optimization tools, machine learning techniques, and statistics.
• Designed and deployed data science and technology-based algorithmic solutions to address the needs of the customer service business.
• Identified, understood, and evaluated new commerce data technologies to determine each solution's effectiveness and the feasibility of integrating it with the product environment.
• Conducted in-depth analysis and predictive modelling to uncover hidden opportunities; communicated insights to the product, sales, and marketing teams.
• Strong analytical and problem-solving skills for getting to the core issues beneath complexity and ambiguity. Excellent communication and troubleshooting skills.
Authorized to work in the US for any employer
Data Scientist
January 2013 to December 2013
Responsibilities
Project: Credit Risk Model
The model was developed to identify risky bank loans using decision trees, helping banks tighten their lending process. The decision tree algorithm was used to identify the factors associated with a higher risk of default.
Role: Data Scientist
Responsibilities:
Participated in analytics initiatives to improve operational efficiency, using the resulting business insights to reduce train derailments, increase train shipments, achieve large emission reductions, analyze customer (shipper and consignee) retention, improve the trip-building process for intermodal movement, and analyze cargo movement.
Worked on programs for evaluating data analytics opportunities to improve railroad operations monitoring and trip-building processes.
Identified additional data sources to augment the opportunities.
Extremely comfortable working with data, including managing a large number of data sources, analyzing data quality, and proactively working with the client's data/IT teams to resolve data issues.
Responsible for data identification, collection, exploration, and cleaning for modeling; participated in model development.
Capability to execute complex substantive analysis, including statistical modeling on large data sets.
Experienced in predictive analytics projects (i.e., supervised and unsupervised machine learning techniques).
Strong programming skills using R, Elastic Search & Machine Learning Algorithms
Designed and helped implement machine learning algorithms to enhance existing data mining capabilities
Used a variety of analytical tools and techniques (regression, logistic regression, GLM, decision trees, machine learning, etc.) to carry out analysis and derive conclusions.
Knowledge of other database platforms such as Oracle, DB2, and IMS-DB, as well as NoSQL (Elastic Search).
Visualize, interpret, report findings and develop strategic uses of data.
Helped create and design reports that use gathered metrics to draw logical conclusions about past and future device behavior.
Reformulate highly technical information into concise, understandable terms for presentation to the team and the client
Demonstrated aptitude for business problem identification, data collection and preparation, modeling, and problem solving
Used data to produce business insights using visualizations.
Write requirements for new analytic reports which will provide cross-functional support across business verticals
Maintain current knowledge of applicable business activities.
Support business users on data discovery, experimentation, hypothesis testing and develop subject matter expertise on functional side of the business.
Support the comprehensive and successful execution of the team's programs in an efficient and effective manner.
Lead and participate in ad hoc special projects and other initiatives, as requested.
Ensure all assigned activities are completed according to prescribed policies and standards.
Followed standard software development practices (source control, unit testing, automation, and performance monitoring).
Contributed insights from the analysis conclusions that tie back to the initial hypothesis and business objective.
Environment: Linux, Hadoop, Hive, MySQL, Spark, R, R-Studio, Tableau, SPSS, Machine Learning, ggplot, Rattle, Elastic Search, SQL, Kibana
Technical Analyst Consultant
2006 to November 2012
• Created several predictive models using statistical modeling techniques to support operations teams and marketing campaigns aimed at optimizing the product portfolio and CRM initiatives.
• Extensively used SAS and SQL for extraction, transformation, and loading of data from large-scale RDBMSs such as Oracle and DB2.
• Conducted data manipulation using merging, appending, concatenating and sorting datasets in SAS.
• Created several applications for the purpose of Statistical Modeling and Data mining using SAS/Base, SAS/SQL, SAS/Stat, SAS/Graph and also automated applications using SAS/Macros.
• Involved in administering the data warehouse using the Warehouse Administrator functionality of SAS.
• Designed and developed data models according to user specifications for the databases of small applications.
TECHNICAL SKILLS
Big Data: Hadoop Ecosystem (Cloudera Platform): Sqoop, Flume, Pig, Hive, HBase, and Spark
RDBMS: Oracle 8.x/9.x/10.x/11.x, MySQL 5.x; NoSQL: MongoDB
ETL & BI Tools: OWB, Oracle BI Tools, Tableau 7.x/8.x
Data Modeling Tools: Oracle Data Modeler, Erwin
Programming Languages: R, Python
Cloud Computing: AWS
Machine Learning: Classification, Regression, Clustering, Feature Engineering
Operations Research: Optimization techniques such as linear and integer optimization
Statistical Methods: Hypothesis testing & Confidence Intervals, Principal Component Analysis, Dimensionality Reduction
Prog. Languages: R, Python, SQL, PL/SQL, COBOL
Technologies: Azure Machine Learning, SPSS, Rattle, Unix
• Data Visualization: Qlikview, Tableau 8.0, ggplot2 (R)
• NoSQL: Elastic Search
• Cloud: OwnCloud
• DBMS: DB2, IMS-DB, Oracle
• Languages/Tools: Python (full SciPy stack), R, C/C++ (STL), Ruby (Rails), Bash, LaTeX
• Operating Systems: Unix (Linux, FreeBSD, OS X), Windows (XP, Vista, 7)
PhD in IT Management
December 2013
Master of Business Administration (MBA)
May 2010
Bachelor of Computer Applications (B.C.A.)
May 2004